PERFORMANCE MONITORING AND OPTIMIZATION


After reading this chapter and completing the exercises, 
you will be able to: 


+ Effectively use performance monitoring tools 

+ Establish a baseline 

+ Recognize acceptable and unacceptable performance thresholds
+ Provide solutions to performance bottlenecks 


Monitoring server performance is a critical function in the enterprise
network. Effectively monitoring a server is a scientific process that 
separates perceived server “slowdowns” from true performance degradation 
based on empirical data. The data you collect is also useful when proposing 
major changes to the enterprise network architecture or new equipment. 
Using the proper tools will help you to obtain accurate data. Each respective 
network operating system (NOS) discussed in this book has its own set of 
monitoring and performance tools, and you should know what those tools 
are and how to use them. 


Whichever tool applies to your operating system, it is critical to establish a 
baseline. The baseline is the fulcrum that balances subjective perceptions of 
network performance against objective data, and establishes what is to be 
considered acceptable performance. For systems that do not perform within 
acceptable parameters, you must accurately determine which server or net- 
work components are causing the bottleneck, and take appropriate action. 

MONITORING THE SERVER


It’s Monday morning. You sit down at your desk and the telephone rings with a complaint
that logon is slow. You attempt to log on to your own computer, but before you
can type a password, another call comes. Same problem. The moment you hang up, the 
telephone rings again and two co-workers are at your cube wanting to know why they 
can’t log on. Your best response at this point is, “I’m working on it.” 


So where do you start? Luckily, you have established performance baselines for the logon 
server. Therefore, you begin by checking the current performance of the server against 
the baseline. Your Windows 2000 Performance Monitor tells you that the amount of 
memory in use is extremely high compared with the baseline. You identify the process 
consuming the memory and discover that the new monitoring agents installed on this 
server are using more resources than predicted. You terminate the processes temporarily 
until more memory can be added to the server. Within 10 minutes, the problem is solved 
and users are logging on quickly. 


Performance monitoring—observing, measuring, and recording the performance of 
critical server and network resources—is essential for troubleshooting and maintaining 
a network. There are several reasons to monitor servers: 


■ To become familiar with your server’s “normal” performance so you know
when there is a problem
■ To notice impending problems and prevent them before they occur
■ To pinpoint existing problems and identify solutions
■ To aid in resource and capacity planning


Performance monitoring is the best tool for systematic troubleshooting, capacity plan- 
ning, and checking on the “health” of servers. It can mean the difference between being 
unprepared when a problem comes up, or anticipating a problem and correcting it before 
users even notice. 


Table 11-1 shows some typical server areas that can be monitored. 


Table 11-1 Server Monitoring Activities

Monitor ...    To Determine
CPU            CPU utilization and performance
RAM            Memory shortage or damaged memory
Hard disk      Disk performance, capacity, and errors
Paging         Page file size and performance
Caching        Cache allocation and performance
Process        Hung or stopped service or process using high CPU resources
               Number of users logged on and types of resources they are accessing

The operating systems discussed in this book use different tools to monitor the performance
of the server components listed in Table 11-1, but the tools are used in similar ways: to
establish performance baselines, to measure current performance (and perhaps compare
it to a baseline), and to keep logs of performance over time.


There are many possible performance measures and results, and performance
monitoring can, at first glance, be a little overwhelming. The key is to focus on the most
significant resources. With performance monitoring, “less is more.” Focusing your
attention on the most critical resources will help to achieve the most effective results.


USING MONITORING TOOLS


The performance monitoring concepts presented in the previous section are consistent 
across all platforms. The specific tools and functions vary according to operating system. 
We will discuss tools for: 


■ IBM OS/2
■ Linux
■ NetWare
■ Windows NT 4.0
■ Windows 2000


Note: The main focus of performance monitoring should be the accurate gathering
and interpretation of performance data, regardless of which operating system
or tool you use. This chapter provides more in-depth coverage of the monitoring
tools for Windows NT 4.0 and Windows 2000 as examples, rather than
detailing all monitoring functions for all of the operating systems.


Among all the NOSs, there are literally hundreds of different measures of performance. 
Although it is not practical to define each performance measure, you should be aware 
of the main tools and resource categories for each operating system. After a brief tour 
of the primary performance monitoring tools in several operating systems, you'll learn 
about establishing a baseline and how to use monitoring results for troubleshooting and 
planning. This chapter also helps you to identify major performance bottlenecks and 
propose solutions for each area. 

IBM OS/2


System Performance Monitor/2 (SPM/2) is designed to analyze hardware and software 
in the OS/2 environment. SPM/2’s primary features for monitoring critical resources are: 


■ SPM/2 Monitor: A Presentation Manager application that displays performance
data in graph form. Data is summarized from real-time input.

■ Data Collection Facility: A tool that gathers data for system resources in use.
The information can be displayed with SPM/2 Monitor in the Presentation
Manager window as graphic or real-time output. The data can also be logged
to a file using the Logging Facility.

■ Report Facility: A program that generates reports from collected data and is far
more detailed than SPM/2 Monitor. Data collected into the Report Facility
can be displayed or exported in three formats: summary, tabular, or dump.

■ Logging Facility: A tool that accesses data from the Data Collection Facility
and saves it to log files. Log files are available to the Report Facility.


SPM/2 uses a distributed management approach to performance monitoring. The SPM/2 
application is installed on a monitoring station. Servers designated for monitoring collect 
data and distribute it back to the monitoring station for analysis as shown in Figure 11-1. 
The advantage of this approach, not only for OS/2 but for any NOS that supports it, is that 
the overhead required to run the monitoring software does not skew the monitoring results. 

Figure 11-1 SPM/2 uses a distributed feature to remotely monitor servers

Linux


UNIX tools are used to monitor performance on Linux systems. These tools are command- 
line utilities that provide statistics on CPU usage, memory, disk I/O, and network connec- 
tions. Although there are multiple UNIX/Linux utilities available for specific monitoring 
purposes, some of the most commonly used are shown in Table 11-2. 


Table 11-2 Common Linux/UNIX Performance Tools

Tool      Function
vmstat    Provides information on memory usage, CPU, and interrupts
ps        Lists all processes currently running on the system
df        Lists disk space used and available
top       Shows the top several processes running and the amount of resources they consume


[Screen capture: a root terminal session running vmstat 10 6, ps, and df. In the capture, ps lists the bash and ps processes, and df reports the root file system (/) at 93% used and /boot at 25% used.]


Figure 11-2 UNIX utilities vmstat, ps, and df provide a snapshot of current system activity 


The vmstat tool provides real-time performance statistics for several resources. For example,
when system performance slows, you can use vmstat to take a quick snapshot of
process, memory, and CPU activity to help determine which resource is causing a bottleneck.
The syntax for the utility looks like this:


vmstat seconds #OfReports 


If you wanted to take a snapshot every 10 seconds and create a total of six reports, you
would type vmstat 10 6, which is what Figure 11-2 shows. If you do not specify the number
of reports, the utility runs continuously until you press Ctrl+C.
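
As a rough sketch of how such reports might be processed by a script, the Python below parses text laid out the way vmstat prints it: a line of group titles, a line of column names, then one row of integers per report. The helper name parse_vmstat is hypothetical, the sample text and its numbers are invented for illustration, and the exact column names depend on the vmstat version installed.

```python
def parse_vmstat(text):
    """Parse vmstat-style text into a list of {column: int} dictionaries.

    Assumes one line of group titles, one line of column names,
    and then one row of non-negative integers per report.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    columns = lines[1].split()  # the second line holds the column names
    reports = []
    for row in lines[2:]:
        values = row.split()
        # Skip repeated headers or damaged rows rather than failing.
        if len(values) != len(columns) or not all(v.isdigit() for v in values):
            continue
        reports.append(dict(zip(columns, map(int, values))))
    return reports


# Invented sample in the shape of vmstat output (not real measurements).
SAMPLE = """\
procs            memory
 r  b   swpd   free   buff  cache
 1  0      0  99800  13724  75760
 0  0      0  99796  13724  75760
"""

reports = parse_vmstat(SAMPLE)
lowest_free = min(r["free"] for r in reports)
print(len(reports), lowest_free)
```

In practice the text would come from capturing the output of a command such as vmstat 10 6; the parsing step stays the same.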

In addition to displaying the processes that consume the most resources, the top command
also provides other information such as the number of users logged on, the amount of
memory consumed, and how much is swapped out to the swap file. Figure 11-3 shows
sample output of the top command.


[Screen capture: output of the top command, with summary lines for load average, process states, CPU states, memory, and swap above a list of processes by PID and user]

Figure 11-3 The UNIX top command provides a comprehensive snapshot of ongoing
system activity


The top command automatically refreshes. This can be advantageous when you are trou- 
bleshooting, but keep in mind that the refresh itself consumes resources. Remember the 
additional load when using the top command in heavily loaded systems. 


If you also use graphical UNIX/Linux utilities, several other performance monitoring tools 
might be available to you, including the GNOME System Monitor (Figure 11-4) and the 
Stripchart Plotter (Figure 11-5), which provides a quick graphical snapshot of processor, 
swap file, network, and PPP activity. 


Third-party tools that provide a graphical interface for monitoring are also useful. For exam- 
ple, Computer Associates’ Unicenter TNG, an enterprise management software package, pro- 
vides a graphical interface to monitor performance on Linux as well as most flavors of UNIX. 

Figure 11-5 The Stripchart Plotter


NetWare 


Novell NetWare uses a service known as Traffic Manager to monitor network traffic.
Traffic Manager runs on Windows NT computers and uses Windows NT’s Performance
Monitor tool to display its data (see next topic).


The Monitor utility is included with NetWare to track server performance (Figure 11-6). 
Many Novell system administrators will leave this screen on instead of a conventional 
screen saver! When Monitor is running, four performance indicators are shown: 


■ Utilization: This shows the CPU utilization rate for servicing network
requests. If this number is consistently greater than 50-65%, your CPU is a
bottleneck. (Specific thresholds are discussed later in this chapter.)

■ Total Cache Buffers: If this number is quite low, your system will suffer from
slow file performance.

■ Current Service Processes: This indicates outstanding read requests. If a read
request is buffered, it means that resources were not available. This may
indicate that you need to upgrade your disk controller.

■ Packet Receive Buffers: This shows packets that are being buffered from
workstations.

[Screen capture: the NetWare Monitor General Information panel, listing utilization, server up time, online processors, original, total, and dirty cache buffers, long term cache hits, current disk requests, packet receive buffers, directory cache buffers, maximum and current service processes, current connections, and open files]

Figure 11-6 The Monitor utility monitors performance under NetWare


For a GUI, use the Java-based ConsoleOne. 
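
The word “consistently” in the Utilization guideline matters: a brief spike is not the same as a sustained load. The Python sketch below shows one possible way to make that distinction. The function name, the sample readings, and the rule that 80% of samples must exceed the threshold are all assumptions made for illustration; they do not come from NetWare or from this chapter.

```python
def is_sustained_above(samples, threshold, min_fraction=0.8):
    """Return True when at least min_fraction of the samples exceed threshold.

    min_fraction is an illustrative assumption, not a vendor guideline.
    """
    if not samples:
        return False
    over = sum(1 for s in samples if s > threshold)
    return over / len(samples) >= min_fraction


# Invented utilization readings (percent), one per sampling interval.
spiky = [20, 25, 90, 22, 18, 95, 21, 19, 24, 23]
busy = [70, 72, 68, 75, 71, 66, 74, 69, 73, 77]

print(is_sustained_above(spiky, 65))  # two short spikes only
print(is_sustained_above(busy, 65))   # every reading is over the threshold
```

With these sample lists the first call reports False and the second reports True, which matches the idea that only the second server has a CPU bottleneck worth investigating.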


Administrators running web or FTP services on NetWare servers will probably rely on 
the Novell Internet Caching System (ICS) utility to track and optimize performance 
using the ICS caching facility, but it can also be useful for monitoring general server 
performance and network activity (see Figure 11-7). 


oon SE 


Summary | Services | Performance | Cache | ICP | Cluster | FTP | Cache Logs | Top Ten Sites | 


Howe General Status 
identifier 00000000 
oO Version 1.2 (1.2.72) 
saat Total memory (MB) 256 
Cache disk space (MB) 13,312 
Network Start time Tue Jan 11 14:12:38 MST 2000 


Up time O Days 1 Hours 29 Minutes 22 Seconds 


Summary State 


CPU utilization (%) 
7 Cache hitrate (%) 
Hierarchy Disk space utilization (%) 
Throughput bytes/second 
Requests/second 
Connections 


Objects cached 


Figure 11-7 The Novell ICS utility 

Windows NT 4.0


Performance monitoring on the Windows NT 4.0 operating system uses the Performance 
Monitor GUI tool. This tool uses objects, instances, and counters to measure performance 
on local servers or remote systems. Open Performance Monitor on a Windows NT system 
by clicking Start, pointing to Programs, pointing to Administrative Tools, and then clicking 
the Performance Monitor icon. 


For best results, monitor an NT server from an NT workstation. Running Performance
Monitor locally on the server creates an artificial load that can skew performance data.
(Remotely monitoring a server can also add to the network load, although the impact
is generally minimal.)


Chart View is the default view for Performance Monitor and provides real-time dynamic
snapshots of server activity (see Figure 11-8). Snapshots are taken at one-second
intervals by default.


[Screen capture: Performance Monitor Chart View for the computer \\INFINITI, charting the Processor counters % Processor Time, % Interrupt Time, % User Time, and % Privileged Time for instance 0, along with the Memory counter % Committed Bytes in Use]


Figure 11-8 Chart View is the default for Windows NT 4.0 Performance Monitor 


Performance is measured by choosing objects to monitor. In Performance Monitor, objects 
are resources such as Processor, Memory, PhysicalDisk, and Network Segment. After select- 
ing an object to monitor, you choose specific counters, which are measures pertaining to 
Performance Monitor objects. For example, when monitoring the Processor object, you 
must specify exactly what it is about the processor you want to monitor by selecting a 
counter. Some examples of Processor counters are: 


■ % Processor Time: A primary indicator of overall processor activity.

■ Interrupts/Sec: The average number of hardware interrupts the processor
receives and services per second. During an interrupt, normal processes
owned by applications, services, and so forth are unable to perform actions,
so you want to be sure to watch this counter.

■ % User Time: The percentage of processor time spent in user mode, which
includes applications, environment subsystems, and integral subsystems. Despite
the use of the word “user,” this counter is not always tied to user activities per se.

■ % Privileged Time: The percentage of processor time spent in privileged
mode, which is designed for hardware driver activity and operating system
components. A high percentage might indicate a failing hardware device or
driver that sends out excessive interrupts.


If there is more than one processor on the system, the Processor object will also have 
multiple instances to distinguish one processor from the other. Instances also apply to 
other resources such as multiple hard disks or multiple NICs. Using instances provides 
the capability to monitor processors or other components collectively or individually. 


You can add objects and counters to the chart that are relevant to the tasks performed 
by the server. (Information concerning how to determine these objects is presented later 
in this chapter.) 


You can also save settings for objects and counters so that you can return to monitor 
the same objects and counters at a later date. The simplest way is to save a Performance 
Monitor file. Once your chart is set, press the F12 key. Enter a name for the file in the 
Save As dialog box (see Figure 11-9). 


[Screen capture: the Performance Monitor Save As dialog box, with Chart Files (*.pmc) selected as the file type]


Figure 11-9 Windows NT Performance Monitor settings are saved with a .pmc 
extension by default 


Windows NT 4.0 also uses performance logs to measure historical performance data and 
to set alerts to call attention to specific threshold breaches. Performance logs are dis- 
cussed more specifically in the Windows 2000 section. 

Windows 2000


Performance monitoring in Windows 2000 uses the Microsoft Management Console
(MMC) graphical interface. The Windows 2000 monitoring tool uses objects, instances,
and counters in a manner similar to Windows NT 4.0. The steps to open and use the
Windows 2000 Performance console, like those for all management tools in Windows 2000,
have changed from NT 4.0. The Performance console can be opened from Administrative
Tools or added as a snap-in to an MMC containing other management tools.


A very handy feature of both Windows NT 4.0 and Windows 2000 is the Explain button
that appears when you want to add a counter. Clicking it displays an explanation of
the otherwise cryptic counters in the list (see Figure 11-10).


[Screen capture: the Add Counters dialog box listing Processor counters such as % Interrupt Time, % Privileged Time, % Processor Time, and % User Time. The Explain text reads: “% Interrupt Time is the percentage of time the processor spent receiving and servicing hardware interrupts during the sample interval. This value is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards and other peripheral devices.”]


Figure 11-10 The Explain button describes each counter 


The Performance console contains two snap-ins: System Monitor, and Performance Logs and
Alerts. System Monitor provides real-time snapshots of system resources on local or remote
servers (Figure 11-11). The Performance Logs and Alerts snap-in offers two functions:


■ Performance logs gather historical performance data over a period of time.

■ Performance alerts send messages when designated thresholds are exceeded
based on dynamic data.

Figure 11-11 The Windows 2000 System Monitor gathering real-time performance data


System Monitor 


The Windows 2000 System Monitor relies on objects and counters to display data in the
chart. Click the plus sign (+) button in the toolbar above the chart to add objects and
counters. From the Add Counters dialog box, you can choose to monitor the local server or a
remote server. Figure 11-12 shows the Performance object PhysicalDisk selected from the
drop-down list. The counter, % Disk Time, has also been chosen. By reading the information
in the Instances list, we can see that the server named “infiniti” has two physical disks.
The disks are labeled 0 and 1. The Instances box provides the capability to monitor the disks
individually or collectively. Selecting “_Total” monitors all physical disks together.
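
A counter selection combines a computer, an object, an optional instance, and a counter, and Windows writes the combination as a path of the form \\Computer\Object(Instance)\Counter. The Python sketch below assembles such paths for the example in this section. The helper name counter_path is hypothetical, and the code only builds text; it does not query Windows.

```python
def counter_path(computer, obj, counter, instance=None):
    """Build a counter path string from its parts.

    The instance is optional because some objects, such as Memory,
    have no instances.
    """
    target = f"{obj}({instance})" if instance is not None else obj
    return f"\\\\{computer}\\{target}\\{counter}"


# The two disks on the server "infiniti", individually and collectively.
paths = [
    counter_path("infiniti", "PhysicalDisk", "% Disk Time", instance)
    for instance in ("0", "1", "_Total")
]
for path in paths:
    print(path)
```

Monitoring instances 0 and 1 separately shows which disk is busy, while the _Total instance gives a single combined figure.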


[Screen capture: the Add Counters dialog box, with the computer \\infiniti and the PhysicalDisk performance object selected]


Figure 11-12 The System Monitor Add Counters dialog box 


Performance Logs and Alerts 


Performance logs monitor resources over a specific period of time. This is termed
“historical performance monitoring.” The time can be hours, days, or weeks depending on
the situation. The log records the data. When logging is completed, the data can be
displayed in a static format in the System Monitor screen.

To begin recording performance logs, you must create a log file:
1. Click the plus sign (+) button to the left of Performance Logs and Alerts. 
2. Right-click Counter Logs. Select New Log Settings. 
3. Give an intuitive name to the new log. Click OK. 


Objects and counters must be added to the log for historical data just as you add objects 
and counters to System Monitor for real-time data. 


Note: The objects and counters are the same in Performance Logs and Alerts as in
System Monitor, except that they can be configured to record over a specific
period of time and can issue alerts upon reaching a specified threshold.

After you choose the objects and counters and return to the Counter Log dialog box,
you have multiple options for the time and frequency of data gathering. Notice that the
counter samples data every 15 seconds by default, because it is expected that performance
logs will record over a longer duration. You can adjust the data-sampling interval
as you like, but if you make it too short, the log files can get quite large and
unmanageable. Figure 11-13 displays the General tab of the Counter Log dialog box.
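
To see why a short interval inflates a log, it helps to count the samples. The Python sketch below compares the default 15-second interval with a 1-second interval over a one-week log. The function name is hypothetical, and the calculation counts samples only; the actual file size per sample depends on the log format and is not estimated here.

```python
def sample_count(duration_seconds, interval_seconds, counters=1):
    """Number of values recorded: one per counter per sampling interval."""
    return (duration_seconds // interval_seconds) * counters


WEEK = 7 * 24 * 60 * 60  # seconds in one week

default_samples = sample_count(WEEK, 15)   # 15-second default interval
one_second_samples = sample_count(WEEK, 1)

print(default_samples)       # 40320
print(one_second_samples)    # 604800
```

Dropping from 15 seconds to 1 second multiplies the number of recorded values by 15, and each additional counter multiplies it again.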


[Screen capture: the General tab, showing the current log file name C:\PerfLogs\WeeklyLog_000001.blg, the counter \\ACC\PhysicalDisk(_Total)\% Disk Time, and a sampling interval of 15 seconds]


Figure 11-13 The General tab of the Counter Log dialog box 


Besides adding counters from the General tab of the interface, you can also select the
Log Files tab to specify characteristics of the log file itself, such as maximum size,
location, and naming preferences (see Figure 11-14). The Schedule tab sets the total time
frame for the log to record data, as opposed to the data-sampling interval on
the General tab, which determines the frequency of system snapshots.


[Screen capture: the Log Files tab, showing the file name WeeklyLog, the “End file names with” option set to nnnnnn, the example file name WeeklyLog_000001.blg in the PerfLogs folder, the Binary File log type, and the log file size options]


Figure 11-14 The options for the Log Files tab 


From the Log Files tab, specify the following items: 


■ A file location, preferably not on the same disk or system you are monitoring,
so as not to skew the results.

■ An intuitive file name, so that you and others can easily identify the file.

■ The “End file names with” option, which tags log files with sequential numbers
or dates.

■ A log file type. The default is binary. Log files can also be saved in formats such as
.CSV to enable simple import to databases or spreadsheets.

■ A log file size. By default, growth is limited only by the space on the hard disk.
Use this option to limit the file to a specific size.
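
Once a log is saved in a text format such as .CSV, a short script can summarize it. The Python sketch below averages each counter column of a small comma-separated sample. The column layout (a time column followed by one column per counter) and every value in the sample are assumptions made for illustration; a real exported log may use longer column names and a different timestamp format.

```python
import csv
import io
from statistics import mean


def column_averages(csv_text):
    """Average every column after the first (assumed to hold timestamps)."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    return {
        name: mean(float(row[index]) for row in data)
        for index, name in enumerate(header[1:], start=1)
    }


# Invented sample data, three samples taken 15 seconds apart.
SAMPLE_CSV = """Time,% Disk Time,% Processor Time
09:00:00,10.0,20.0
09:00:15,30.0,40.0
09:00:30,20.0,30.0
"""

averages = column_averages(SAMPLE_CSV)
print(averages)
```

Averages like these are the kind of figures that go into a baseline table.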


Once all parameters are determined, you can start the log manually or schedule it to run 
automatically in the future. Initiate a manual start from the Performance console as follows: 


1. Click the plus sign (+) button on Performance Logs and Alerts. 
2. Click Counter Logs. This displays all eligible logs. 

3. Right-click the log and select Start. 

4. To stop the log manually, right-click the log and select Stop. 

The options to schedule the log are displayed in Figure 11-15. You can start the log at a
specific time and date, and stop it after a specific period of time has elapsed or at an exact
time and date. There are also two options for what happens when a log file closes. First, you can
specify that when a log file reaches a scheduled termination, another log file begins. This is
useful for breaking the log files into smaller, more manageable chunks of data. Second, you
can run an executable or batch file. For example, you might want to run a .bat file that
includes the net send command to alert the administrator that the log is complete.


[Screen capture: the Schedule tab of the System Overview Properties dialog box, with Start log and Stop log options (manually, at a set time and date, after a set duration, or when the log file is full) and, under “When a log file closes,” the options Start a new log file and Run this command]


Figure 11-15 Use the Schedule tab to start and stop the log 


ESTABLISHING A BASELINE 


One of the most important aspects of monitoring performance is establishing a baseline. 
A baseline is established by recording performance data when a server is healthy, or run- 
ning normally. When problems occur (like the slow logons in the opening scenario), use 
performance monitoring tools to observe dynamic real-time values. Compare the real- 
time output with historic performance data to determine the bottleneck in the system. 
This methodical approach to problem solving will consistently yield faster results than 
a shotgun approach of random fixes based on experience and luck. 


What Is a Bottleneck? 


A bottleneck refers to the delay in transmission of data through the circuits of a computer’s 
system. When you monitor performance to detect a bottleneck, you are looking for the 
resource (processor, memory, etc.) that is causing the delay in transmission of data. 

Most of us experience bottlenecks every day outside of information technology. Think
of a freeway with four lanes heading in the same direction. If traffic is moving at 65 mph
on average and a driver moves into the fast (far left) lane and proceeds at 50 mph, that
driver creates a bottleneck. The analogy of a four-lane freeway also relates to servers because
there are four basic resources to monitor to create a baseline:


■ Processor
■ Memory
■ Disk subsystem
■ Network segment


It is important to note that these are the basic resources, not a comprehensive, detailed pic- 
ture of a server. We discuss each of these resources in more detail later in this chapter. 


When to Create a Baseline 


The best time to create a baseline is while the server is experiencing maximum activity.
Referencing the logon scenario again, you would create a baseline for the logon
server in this network. Record historical performance data during the period of time
when the most users are logging on to the system (such as 30 to 60 minutes after people
arrive at work in the morning).


If you were creating a baseline for a database server, the timing might be completely dif- 
ferent. Perhaps your company runs the majority of reports on a database server after 
business hours, between 7 P.M. and 9 P.M. This will be the best time to create the base- 
line for that database server. You can begin to see why using scheduled performance 
monitoring (as opposed to manually started monitoring) can be advantageous. In many 
instances, the optimal monitoring time is not within normal business hours. 


What if you are not familiar enough with a server to know the optimal monitoring 
time? In such instances, use historical performance monitoring to discover the busiest 
periods of activity. Then create the baseline for that interval. 
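
Finding the busiest period from logged data can be as simple as picking the interval with the highest count. The Python sketch below does this for a set of hourly logon totals. The function name and all of the numbers are invented for illustration; real figures would come from a performance log recorded over several days.

```python
def busiest_hour(hourly_counts):
    """Return the (hour, count) pair with the highest count."""
    return max(hourly_counts.items(), key=lambda item: item[1])


# Invented logon totals keyed by hour of day (24-hour clock).
history = {7: 40, 8: 510, 9: 120, 12: 35, 13: 90, 17: 15}

hour, count = busiest_hour(history)
print(hour, count)  # 8 510
```

With this sample data the baseline would be scheduled for the 8 A.M. hour.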


What to Monitor in a Baseline 


Now that you’ve determined when to monitor performance, the next decision is what 
resources to monitor. As previously stated, the basics of Processor, Memory, PhysicalDisk 
and Network Segment are a good place to start. There are exceptions, however, and 
there is often the need for greater detail. 


Returning again to the logon scenario, monitoring the basic resources can provide 
important information for troubleshooting the slow logon problem. But, based on the 
specified problem, we can add more objects and create more relevant data. Table 11-3 
illustrates possible objects and counters for creating a baseline in the logon scenario for 
Windows NT or Windows 2000. 

Table 11-3 Objects and Counters to Create a Baseline on a Logon Server

Object             Counter
Processor          % Privileged Time
Processor          % User Time
Memory             % Committed Bytes in Use
Server             Logon Total
Server             Logons/sec
Network Segment    % Network Utilization


Note that in addition to the “basic four” resources, we’ve chosen to monitor the Server 
object, with counters for logon statistics. These additional resources, when included in 
the baseline, provide specific data about the number of users who normally log on in 
the recorded time. Even more useful is the number of logons per second. This statis- 
tic gives an objective number to use for gauging logon speed. The nature of “fast” or 
“slow” is very subjective. Note also that the PhysicalDisk object is excluded from this 
baseline. Disk activity is not a major factor in the logon process. 


PUTTING THE TOOLS TO WORK

Let’s continue with the logon scenario to walk through how baseline data and monitoring
tools can be used to solve a performance problem. Table 11-4 shows the baseline
measurements for the logon server, and Table 11-5 shows the comparative real-time data
for the same objects during the Monday morning slowdown.


Table 11-4 Baseline Data for the Logon Server

Object             Counter                     Average (over 30 minutes)
Processor          % Privileged Time           9%
Processor          % User Time                 14%
Memory             % Committed Bytes in Use    37%
Server             Logon Total                 510 (total over 30 minutes)
Server             Logons/sec                  5
Network Segment    % Network Utilization       36%

Table 11-5 Real-Time Data for the Logon Server

Object             Counter                     Real-Time Statistics
Processor          % Privileged Time           15%
Processor          % User Time                 14%
Memory             % Committed Bytes in Use    39%
Server             Logon Total                 1 (one-second snapshot)
Server             Logons/sec                  1
Network Segment    % Network Utilization       76%


The first step in interpreting this data is to look for significant changes. In this case, you 
note the following: 


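■ Processor: % Privileged Time increased by 6 percentage points (from 9% to 15%).

■ Processor: % User Time is unchanged at 14%.

■ Memory: % Committed Bytes in Use increased by 2 percentage points (from 37% to 39%).

■ Server: Logons/sec decreased from 5 to 1.

■ Network Segment: % Network Utilization increased by 40 percentage points (from 36% to 76%).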


The most significant changes occurred in the number of logons and network utilization. 
From this data you can safely say that logons definitely are slow and the bottleneck is 
the flow of data on the network interface. Solving the problem will require “drilling 
down” deeper into specifics of network utilization. The value of the baseline is that you 
have eliminated processor time and memory as possible bottlenecks. 
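The comparison of Tables 11-4 and 11-5 can also be sketched in a few lines of Python. This is only an illustration: the counter values are copied from the two tables, but the function name and the 75% relative-change cutoff are assumptions chosen for this example, not values that Performance Monitor defines. Server: Logon Total is left out because a 30-minute total and a one-second snapshot are not directly comparable.

```python
# Values copied from Table 11-4 (baseline) and Table 11-5 (real time).
BASELINE = {
    "Processor: % Privileged Time": 9,
    "Processor: % User Time": 14,
    "Memory: % Committed Bytes in Use": 37,
    "Server: Logons/sec": 5,
    "Network Segment: % Network Utilization": 36,
}
REAL_TIME = {
    "Processor: % Privileged Time": 15,
    "Processor: % User Time": 14,
    "Memory: % Committed Bytes in Use": 39,
    "Server: Logons/sec": 1,
    "Network Segment: % Network Utilization": 76,
}


def significant_changes(baseline, current, min_relative_change=0.75):
    """Return the counters that moved by at least min_relative_change of baseline.

    The 0.75 default is a hypothetical cutoff used only for this illustration.
    """
    flagged = {}
    for counter, base in baseline.items():
        delta = current[counter] - base
        if abs(delta) / base >= min_relative_change:
            flagged[counter] = delta
    return flagged


for counter, delta in significant_changes(BASELINE, REAL_TIME).items():
    print(f"{counter}: changed by {delta:+d}")
```

With this cutoff, only Server: Logons/sec and Network Segment: % Network Utilization are flagged, which matches the reading of the data given above.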


Note: More specifics and possible resolutions of this scenario are addressed later in
the chapter.


CAPACITY PLANNING 


The baseline measurements for a server can also be used for capacity planning. This is 
the practice of monitoring resources for the purpose of projecting the effect of increas- 
ing or decreasing workload on a server. By measuring the performance of a server under 
current conditions, we can project how it will perform under another set of conditions. 
In the current business environment of mergers and acquisitions, capacity planning 
makes for a smoother IT transition. 


Using our logon scenario, the current network has 750 users. Of these, 510 logged on 
during the performance monitoring that created the baseline in Table 11-4. You learn 
in a meeting that your company has acquired another company of equal size. Your IT 
staff has the task of merging IT departments and will be responsible for user logons and 
security. You will need to accommodate twice the current number of users on the
network. That means 1500 users logging on. Can your server handle the load? Creating
baselines for capacity planning will help answer these questions not only for logon 
servers but also for many network resources. 
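As a rough first estimate for the merger scenario, you can scale the baseline figures by the ratio of projected users to current users. The sketch below assumes that load grows linearly with the number of users, which real servers rarely do exactly, so treat the output as a starting point for further monitoring rather than a prediction. The function name and dictionary keys are made up for this example; the numbers come from Table 11-4.

```python
def project_load(baseline, current_users, projected_users):
    """Scale each baseline figure by projected_users / current_users.

    Assumes linear scaling, which is a simplification.
    """
    factor = projected_users / current_users
    return {name: value * factor for name, value in baseline.items()}


# Figures from Table 11-4; processor time is privileged (9%) plus user (14%).
baseline = {
    "logons in 30 minutes": 510,
    "logons/sec": 5,
    "% processor time": 9 + 14,
    "% network utilization": 36,
}

projected = project_load(baseline, current_users=750, projected_users=1500)
for name, value in projected.items():
    print(f"{name}: {value:g}")
```

Under this assumption the projected network utilization is 72%, close to the 76% recorded during the Monday morning slowdown in Table 11-5, which suggests the network deserves attention before the new users arrive.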


The sections that follow outline acceptable levels of performance for basic resources 
(processor, memory, disks, and network utilization) and give solutions to improve per- 
formance for each resource. Each of these resources works with the others hand in hand 
and is capable of influencing the behavior of other resources. 


PROCESSOR 


Processor time is measured as a percentage of time that the processor is active, execut- 
ing threads submitted by processes (running programs) on the system. (A thread is a 
main component of an application and is the means by which the application accesses 
memory and processor time). One hundred percent represents constant activity. 


Acceptable Processor Performance 


A processor running constantly at 100% is overworked and server performance will
deteriorate rapidly. Acceptable levels of processor activity extend up to 60-65% on a consistent
basis. Levels exceeding 65% during performance monitoring usually indicate that the
processor is the bottleneck in the system. However, the specific processor utilization per- 
centage that is acceptable within an organization can vary. For example, perhaps you con- 
sider 65% processor utilization to be acceptable for the intranet web server that company 
employees use. However, for the Internet web transaction server, 65% is way too high, 
because online purchases will take too long and impatient buyers might cancel transactions. 


Note: It is not unusual for the processor to peak or spike higher than 65% for a brief
period of time. When new processes are started or when services are starting
after rebooting a server, processor levels spiking to 100% are totally acceptable.


A bottleneck is indicated when known applications, processes and/or services push proces- 
sor levels beyond 65% for an extended period of time and the processor does not return to 
lower levels until the applications, processes, or services are terminated. 
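The difference between a brief spike and a sustained bottleneck can be expressed as a simple check on a series of processor samples. The following is a minimal sketch: the 65% threshold comes from the text, while the function name, the sample data, and the choice of five consecutive samples as "an extended period" are assumptions for illustration only.

```python
def sustained_above(samples, threshold=65.0, min_consecutive=5):
    """Return True if at least min_consecutive samples in a row exceed threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical % Processor Time samples taken at regular intervals.
spike = [20, 25, 100, 95, 30, 22, 18, 24]      # brief peak, then back to normal
overload = [40, 70, 82, 91, 88, 79, 85, 90]    # stays above 65%

print(sustained_above(spike))     # False
print(sustained_above(overload))  # True
```

How many consecutive samples count as "extended" depends on your sampling interval and on what your organization considers acceptable.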


Processor Solutions 


The following sections present different approaches to improving processor performance. 


Implement SMP 


If the processor is the bottleneck, additional CPUs can be added to a server to improve 
performance and handle increased loads. All major NOSs under discussion in this book 
support symmetric multiprocessing (SMP). Many 32-bit applications can benefit from 
SMP if the code allows multithreading, which is the ability to run two or more
program threads at once. For example, if a program runs two threads on an SMP system
with two processors, each processor can handle a thread simultaneously. With a single 
processor, the program can still run multiple threads but the processor can only execute 
a single thread at one time. 


Note: Only the simplest programs run a single thread. Most applications (not only on
the server but also on most client workstations) run several threads at once.


Add Servers 


Sometimes the best solution to a processor bottleneck is to simply add another server, 
especially when a server is performing multiple tasks that may conflict with each other. 
For example, a company may be using a single database server to perform sales transac- 
tions and provide reports based on those transactions. Transactions and queries for 
reports may require multiple reads from tables simultaneously. While adding another 
processor (SMP) may improve performance, a better solution would be to add another 
server dedicated to running queries to create reports. 


Remove Compression 


Compression is storing data in a format that requires less space than usual. Simply stor- 
ing data does not place a greater load on the processor. However, when data is written 
to the compressed partition or folder, the processor must work harder to calculate the 
compression algorithms. Removing compression from partitions or folders where data 
is written frequently can free the processor to perform more critical tasks. The type of 
data that you choose to compress, if any, is also a factor. Some file types do not com- 
press well, and processor utilization will be wasted on these files. For example, multi- 
media files such as movie files and JPEG files do not compress well. 


Remove Unnecessary Encryption 


Encryption uses any of several methods to protect sensitive data from prying eyes. However 
useful, encryption is processor intensive, and places a greater load on the processor. Just as 
with compression, the processor performs calculations to encrypt and decrypt data. The 
operative word in this solution is unnecessary. Security is important and when encryption 
is warranted, the better solution is upgrading or adding additional processors. 


Caution: Although administrators are usually adept at understanding encryption, you
should be careful about users implementing encryption. There are several
encryption schemes and utilities available. You do not want users to place encrypted
data on network resources where server processors must perform the encryption.
In addition, some encryption schemes can make data permanently inaccessible.
Implement Clustering


Clustering is a solution to performance issues that benefit from load balancing. Clustering 
is connecting two or more computers together in such a way that they behave like a sin- 
gle computer. As a solution to slow processor performance, clustering is essentially adding 
another computer to aggregate performance in addition to providing fault tolerance. 


Remove Software RAID (Especially RAID-5) 


RAID (as defined in Chapter 5) provides fault tolerance and in some cases can actually 
improve performance. Software RAID-5, however, can significantly diminish processor 
performance. As data is written to the hard disk, the processor must calculate the algo- 
rithms for the parity bit that creates fault tolerance. This requires considerable processor 
time and, consequently, other processes may suffer. If the RAID-5 array is primarily for 
reading data, this is not an issue because parity calculations are not performed during 
reads. Hardware RAID-5 does not burden the server CPU because the parity calcula- 
tion occurs on a separate processor designed for RAID functionality. 
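To see why parity costs processor time on writes but not on reads, consider a simplified model in which parity is the byte-by-byte XOR of the data blocks in a stripe. The sketch below is illustrative only: the block contents and function name are invented, and a real RAID-5 implementation also rotates parity across the disks and works with much larger blocks.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data disk in the stripe.
data_blocks = [b"LOGON", b"SERVR", b"DISKS"]

# Every write to the stripe requires this calculation.
parity = xor_parity(data_blocks)

# If the second disk fails, its block is rebuilt from the survivors and parity.
rebuilt = xor_parity([data_blocks[0], data_blocks[2], parity])
print(rebuilt == data_blocks[1])  # True
```

A normal read simply returns the data blocks and never touches the parity calculation, which is why a read-mostly software RAID-5 array places little extra load on the processor.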


Move Processor-Intensive Applications or Services 


Moving applications or services that overwork the processor is called load balancing. It 
includes installing an application or service on a second server, and deleting the appli- 
cation or service from the server that is overworked. For example, if one server is func- 
tioning as both the DHCP and WINS server, install WINS on another server and delete 
WINS from the server with DHCP. 


You can also keep the application or service on the original server and then install it on 
a second server to balance the load between the two servers. This is a common practice 
in web servers. Instead of overloading a single web server, administrators place the same 
web content on two or more other web servers, and the web servers take turns in ser- 
vicing client requests. 
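The "taking turns" arrangement described above is often called round-robin distribution. A minimal sketch of the idea follows; the server names are hypothetical, and in practice the rotation is usually handled by DNS or a dedicated load balancer rather than by code you write yourself.

```python
from itertools import cycle


def round_robin(servers):
    """Return a function that hands out servers in rotating order."""
    rotation = cycle(servers)
    return lambda: next(rotation)


# Hypothetical web servers that all hold the same content.
next_server = round_robin(["web1", "web2", "web3"])

# Five client requests are spread across the three servers in turn.
assignments = [next_server() for _ in range(5)]
print(assignments)  # ['web1', 'web2', 'web3', 'web1', 'web2']
```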


Verify Proper Operation of Applications and Drivers 


When not running normally, applications or bad drivers can cause excessive processor 
utilization. To detect problems with applications, monitor the individual process of the 
application. It will also be useful to monitor the number of threads utilized by the appli- 
cation by using the following object/counter combination: 


■ Object: Process
■ Counter: Thread Count


As an example, in Figure 11-16, Windows 2000 Performance Monitor is monitoring the 
Diskeeper defragmentation utility running five threads (numbered 0-4) and utilizing
over 60% processor time. In this case, it was acceptable because I deliberately set Diskeeper 
to run at a high priority and there were no other pressing tasks to run at the time. 
Figure 11-16 Monitor the number of threads in use and the processor utilization they
require 


A significant change in the number of threads used by an application, compared to a base- 
line, indicates problems in the application code. Some applications may run multiple 
instances, which can also increase the load on the processor. 


Corrupted device drivers can also demand excessive processor time. An effective mea- 
surement for this problem is to monitor the following object/counter combinations: 


■ Object: Processor
■ Counters: % Interrupt Time or Interrupts/sec


Compare the results to a baseline. Significant increases in the number of interrupts indi- 
cate problems with hardware devices and/or the drivers. For example, a few years ago, 
I had a new file server with the best equipment my company could afford at the time. 
Initially, it performed fine and I largely ignored it except for normal maintenance. One 
day after clearing dust from inside the server and starting it up, it seemed to take quite 
a long time to boot. Then, it took a long time to retrieve even the smallest files from 
the file server. I ran Windows NT Performance Monitor from a different server (so as 
not to skew the results) to record Interrupts/sec. The interrupts were far above the base- 
line for this system, and it turned out that the file server’s NIC was failing. A failing NIC 
(and several other types of hardware) will often issue constant interrupts because it is 
not able to determine that the processor has responded to the interrupt requests. After 
replacing the NIC, the server returned to its normal level of performance. 


Note: Do not be too concerned if Interrupts/sec is over 100 when idle; the system
clock accounts for this by sending regular interrupts every 10 milliseconds.
Set Process Priority


In Windows NT and Windows 2000, you can manually set a process or application to 
run at a specific priority to ensure that it does not dominate processor utilization at the 
expense of other applications. You can also adjust the process priority to force the 
processor to favor the process or application over others. Some applications allow you 
to adjust settings within the application, or you can use Task Manager to configure the 
application priority: 


1. Press Ctrl+Shift+Esc to access Task Manager.
2. On the Processes tab, select the application’s process.
3. Right-click the process, click Set Priority, and choose a priority from Low to
Realtime (see Figure 11-17).


Figure 11-17 Changing the priority level of a process


Caution: Use Realtime priority sparingly, if at all, because if the process runs
indefinitely, then the operating system, other server functions, or applications can
lock or hang due to a lack of processor time.
MEMORY


The following sections present different approaches to improving memory performance. 
Memory is indirectly one of the most critical performance factors. A low memory con- 
dition causes higher disk utilization due to disk swapping. Low memory also affects sta- 
bility. Some operating system information can never be swapped to hard disk (such as 
passwords or other security-sensitive data). However, if you are out of memory, there is 
no place else for it to go, and the server might fail. It is critical to determine what accept- 
able memory performance is and how to remedy low memory conditions. 


Acceptable Memory Performance 


Defining acceptable memory performance is extremely subjective. What is tolerable to 
one organization may be completely unacceptable to another. Most of the time, mem- 
ory “performance” (that is, speed) is not an issue because it performs in nanoseconds. 
However, not having enough memory clearly is a performance issue, because the NOS 
must then turn to virtual memory paging on the hard disk—the most common bottle- 
neck component in the system. 


The most important counters for memory are: 
■ Object: Memory
■ Counter: % Committed Bytes in Use
■ Counter: Page Faults/sec
■ Counter: Available Bytes
■ Counter: Pages/sec
■ Counter: Pool Non-Paged Bytes


Committed Bytes in Use represents a percentage of the total system memory (physical 
memory plus virtual memory) currently used by processes running on the computer. The 
first rule of thumb is that this percentage should remain relatively constant. Only slight 
variations are acceptable. When processes are stopped or started or the number of con- 
nected users changes, variation is normal. However, a steady increase of committed bytes, 
in the absence of additional processes or user load, frequently indicates a memory leak. 
A memory leak occurs when an application allocates memory but does not release it when the
application is finished with it. At first, a memory leak will cause more paging, which in itself
deteriorates performance. Later, as both memory and available swap file space become 
more scarce, the memory leak can eventually cause a system crash. 
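One way to tell a steady climb from normal variation is to fit a straight line through the logged % Committed Bytes in Use samples and look at its slope. The sketch below uses an ordinary least-squares slope. The sample values, function names, and the cutoff of 0.5 percentage points per sample are assumptions made for illustration, and a rising trend is only a hint that still has to be checked against changes in processes and user load.

```python
def trend_per_sample(samples):
    """Least-squares slope of the samples, in percentage points per sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to measure a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def looks_like_leak(samples, min_slope=0.5):
    """Flag a series whose upward trend is at least min_slope (hypothetical cutoff)."""
    return trend_per_sample(samples) >= min_slope


# Hypothetical % Committed Bytes in Use readings taken at regular intervals.
steady = [37, 38, 36, 37, 39, 37, 38, 36]     # slight variation around 37%
climbing = [37, 39, 41, 44, 46, 49, 51, 54]   # rises with no added load

print(looks_like_leak(steady))    # False
print(looks_like_leak(climbing))  # True
```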


Available Bytes is the amount of physical memory available to processes running on the 
computer. It reflects the last observed value rather than an average. 
Pages/sec is a good indicator of excessive paging in a virtual memory system. Paging is
a technique to help ensure that the data needed is available as quickly as possible. Recall 
from Chapter 8 that the page file is designated space on the hard disk used as memory. 
Each time a page is needed that is not currently in memory, a page fault occurs. When 
this value exceeds 20 per second on a consistent basis, performance will deteriorate. 
Excessive page faults are normally due to insufficient memory or memory leaks. 


Pool Non-Paged Bytes is the number of bytes in the nonpaged pool, an area of system 
memory for objects that cannot be written to disk but must remain in physical mem- 
ory as long as they are allocated. The Registry, for example, cannot be paged to disk. 


Memory Solutions 


All solutions to memory problems fall under one general category: add more memory. 
However, it’s important to remember that all resources work hand-in-hand; no resource 
is independent. Increasing or decreasing one resource will always have some impact on 
others. In the case of excessive paging, adding more memory reduces the need for pag- 
ing and helps to reduce the hard disk bottleneck. 


Add Memory 


Adding memory can never harm anything except the budget. There are many instances in 
which adding memory is the best and easiest solution. The best method to determine when 
additional memory is justified is through performance monitoring. Without performance 
analysis, simply adding more memory can mask a more serious underlying problem. 


Upgrade the Motherboard to Accept More Memory 


All motherboards have a limit on the amount and type of memory that can be installed. 
When this limit is reached, one solution is to upgrade to a motherboard that accom- 
modates more and/or faster memory. Frequently, this is an expensive solution and in 
some instances not cost effective based on advances in technology. For example, a moth- 
erboard that has an older Pentium processor may be limited to 512 MB of RAM. Even 
if an upgraded motherboard accommodated 2 GB of RAM, significant performance 
improvements may not be achieved until the processor is also upgraded. Generally, 
upgrading the motherboard really means replacing the server. The original server 
becomes spare parts or is used in a less demanding role. 


Increase or Optimize Swap File Size 


Swap file space goes by different names but is essentially space designated on a hard disk 
to act as memory. If all physical memory is used and there is not enough swap space, the 
system will report out-of-memory errors. One solution is to increase the swap file space, 
but you should realize that this doesn’t really improve performance, since the system is 
still making slower disk accesses instead of rapid memory accesses. Increasing swap file 
space only prevents more out-of-memory errors. 
Depending on the current space dedicated to the swap file, increasing its size may be only a
temporary solution. If the swap space or paging file is increased to meet memory needs,
performance monitoring will also reveal a corresponding increase in the number of page faults.


Windows-based servers build a page file equal to the amount of RAM by default dur- 
ing installation. NT 4.0 Workstation and Windows 2000 Professional default to 1.5 times 
physical memory because it is assumed that there is less physical memory in client work- 
stations. The swap space or paging file functions best at a size of 300-500 MB. A previ- 
ous rule of thumb was 1.5 to 2 times the size of physical RAM, but with physical 
memory reaching into gigabytes on servers, this number is no longer reasonable. Too 
much space allotted to a swap file leads to fragmentation. (Fragmentation is discussed 
later in the chapter.) 


You can improve overall system performance regarding swapping to the hard disk by 
placing the swap file in an optimum location. By default, most NOSs place the swap file 
on the same hard disk as the operating system itself (often referred to as the system disk). 
Because the system disk is often quite active running normal operating system services, 
the swap file must compete for disk access. If available, it is better to place the swap file 
on a separate physical disk that is less active (another partition on the same disk provides 
no benefit). Better yet, you can split the swap file between multiple disks (for example, 
in RAID-0 striping), so that multiple disks can service swap-file activity at once. 


If you are trying to determine which of two disks would be best to store the swap file,
and all other factors are equal, consider the number of heads the hard disks have. The
drive with more heads will perform better.


If the swap file has the ability to grow, as is the case for Windows operating systems, it 
is better to set a fixed size for the swap file that is large enough to service present and 
future needs. This helps to prevent fragmentation as the swap file adjusts in size. 


As a security precaution, you might consider clearing the swap file when the system
reboots. If sensitive information is paged to the swap file and someone is able to gain
physical access to it, they might be able to retrieve information from it. This possibility is
extremely remote for a number of reasons; however, some highly secure organizations
require it. Note that clearing the swap file will cause slower shutdown and startup times
while the system clears and re-creates the swap file.


Use Faster Memory 


As discussed in Chapter 6, there are different kinds of RAM, some faster than others. 
SDRAM has almost entirely replaced EDO DRAM and is about twice as fast. SDRAM 
is capable of synchronizing with the CPU bus and reaching clock speeds of 133 MHz. 
RDRAM and DDR SDRAM (discussed in Chapter 3) appear to be the next generation 
of high-performance memory, each capable of more than 1 GBps of data throughput. 
Choosing faster memory usually requires a faster motherboard unless, for example, you
have 100 MHz SDRAM installed on a 133 MHz bus. In that case, upgrading to 133 MHz 
SDRAM will take advantage of the faster bus speed. 


Distribute Memory-Intensive Applications or Services 


As discussed in respect to processors, the best solution to memory problems can be load 
balancing, because you utilize the hardware resources of another server to alleviate the 
server load. Thorough performance monitoring and analysis will tell you whether this 
is the best approach. When monitoring applications and services, pay close attention to 
which ones are processor intensive and which are memory intensive. A server providing 
basic file/print services will be memory intensive and probably not place significant 
demand on the processor. In contrast, a database server providing report functions and 
servicing multiple queries will be very processor intensive. Familiarity with the relative 
needs of applications and services in your network will assist you in making the most 
efficient distribution of resources. 


Check for Memory Leaks 


Memory leaks were discussed earlier in this chapter in relation to the % Committed Bytes 
in Use counter. This is perhaps the best indicator of a memory leak. Committed bytes should 
remain relatively constant. If they continue to increase gradually over time, yet no additional 
processes are introduced to the server, this is a strong indicator of a memory leak. 


The best long-term solution to a memory leak is to contact the vendor so that it can 
make alterations to the code to stop the leak. Usually, the fix is an update that you can 
download. The short-term solution is to terminate the application, reboot the server, and 
restart the application. This forces the application to free memory no longer being used. 


HARD DISK


As always, the hard disk seems to be the slowest performing of all the server compo- 
nents. Even with SCSI-3, Fibre Channel, and rotation speeds upward of 15,000 rpm, 
hard disks cannot begin to compete with the speed of the processor and memory. 
However, you can still arrive at an acceptable level of performance given the physical 
limitations of hard disks. 


Acceptable Hard Disk Performance 


Exact thresholds for determining an acceptable speed for the transfer of data from hard 
disks or any storage devices are even more subjective than memory or processor per- 
formance. The key is to obtain baseline numbers on current performance regardless of 
whether it is perceived to be slow or fast. To determine whether the hard disk is able to
reasonably keep up with I/O requests, use the following object and counters: 


■ Object: PhysicalDisk
■ Counter: % Disk Time
■ Counter: Current Disk Queue Length
■ Counter: Avg. Disk Bytes/Transfer


The % Disk Time counter represents the amount of time that the disk services read or 
write requests. You generally want to see less than 50% for this counter. Current Disk 
Queue Length represents the number of outstanding I/O requests waiting for the hard 
disk to become available. If the hard disk is overly taxed, then there will be several
outstanding requests. You generally want to see no more than two requests queued. This
counter is an instantaneous view; if you want to check averages over time, PhysicalDisk
counters such as Avg. Disk Queue Length and Avg. Disk Bytes/Transfer are also available.


Hard Disk Solutions 


If the time comes when hard disk performance is deemed to be unacceptable, you can 
implement solutions such as the ones offered below. 


Add or Replace Hard Disks 


Hardware or software RAID arrays can significantly increase disk performance and provide 
fault tolerance. Hardware RAID is superior to software RAID, but it is also more expen- 
sive. While arguments abound concerning whether software RAID provides true fault tol- 
erance, this is not the forum for that discussion: We're concerned with performance. 


Both hardware and software RAID arrays increase performance by striping data across 
multiple disks. Because multiple drive heads are working simultaneously to write and/or 
read data, transfer speeds will be faster than non-RAID disks. For example, you have a 
software RAID-5 array consisting of three hard disks, and performance is unacceptably 
slow. By adding another disk, you aggregate total performance across four hard disks 
instead of three (assuming you are using SCSI, not IDE). The only potential problem 
might be processor utilization for the parity calculation, in which case you might also 
need to add a processor, upgrade the existing one, or switch to hardware RAID. 


Note: Software RAID-5 will increase performance on disk reads. Performance
suffers on disk writes, however, due to processor-intensive parity calculations.
Defragment Disks


Fragmentation on hard disks occurs through the normal processes of creating, moving, 
copying, and deleting files. The result, over time, is that single files are spread out in 
pieces across the disk. If the condition persists, disk transfer rates deteriorate because the 
drive head must search around and across multiple sectors to read a single file. 
Comparing real-time and baseline disk activity can provide evidence of fragmentation. 


On a Windows NT/2000 server, use the following object and counter to find evidence 
of fragmentation: 


■ Object: PhysicalDisk
■ Counter: % Disk Read Time


Defragmentation relocates fragmented files back into a contiguous layout. Running a defrag- 
mentation utility such as Executive Software’s Diskeeper (see Figure 11-18) on a regular 
schedule will yield an appreciable increase in performance. Obviously, defragmentation is 
highly disk intensive, so you should run it only when disk utilization is at its lowest. You can 
set defragmentation to start on a schedule, or configure defragmentation to start automati- 
cally when the hard disk reaches a certain point of fragmentation. In the enterprise, you will 
want to use Diskeeper’s capability to remotely defragment other servers and workstations. 
Figure 11-18 Diskeeper defragments the hard disk
Note: There are not many defragmentation utilities for servers. The most common are
Diskeeper (www.executivesoftware.com) and Raxco's PerfectDisk 2000
(www.raxco.com).


Add Faster or Additional Controllers 


Adding controllers can solve performance problems. It is very similar to adding another 
lane to a highway. More controllers accommodate more disks. More disks can balance 
the workload from multiple users. More controllers and/or disks also increase options 
for deploying RAID arrays. 


The most dramatic and fundamental improvement to the disk subsystem is to upgrade 
from IDE disks to SCSI disks. Traditionally SCSI controllers or interfaces have supported 
faster data transfer rates than IDE. The gap has narrowed with the introduction of EIDE, 
ATA-5, and the upcoming ATA-6. (Recall from Chapter 5 that SCSI-3 supports data 
transfer rates up to 320 MBps, while ATA-5 and ATA-6 support data transfer rates up 
to 100 MBps.) 


Distribute Files 


Distributing files works to solve disk performance problems in a way similar to load bal- 
ancing. Frequently accessed files are distributed over multiple servers instead of residing 
on a single server. All network operating systems have some form of distributed files. In 
the Microsoft environment, it is called the Distributed File System (Dfs). Distributing 
files has the following advantages: 


■ There is a single access point for users. In Microsoft Dfs, for example, user
computers map to a single file share point and still access files on multiple
servers. The share point is the Dfs server, which redirects the requests to the
appropriate servers. The process is transparent to users, and security is
maintained no differently than for files accessed normally.

■ Distributed files can be a cost-effective performance alternative to adding
more servers or upgrading processor, memory, and/or disk resources.


Archive Files to Long-Term Backup Media 


As hard disks exceed 75-80% of capacity, performance starts to deteriorate. When large
portions of data on a disk must be maintained but not frequently accessed, archiving files
to long-term storage can both reduce the risk of running out of disk space and improve
performance. Offline storage is available from many hardware and software vendors, but
the main idea is that when a given file has not been accessed for a specific period of time,
the file is automatically moved to offline storage, such as an optical drive or tape. Users
can still access the data, but it arrives more slowly as it is retrieved from the offline stor- 
age media. Windows 2000 Server integrates this capability into the operating system. 
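The selection rule behind this kind of archiving, "not accessed for a specific period of time," can be sketched with a short script that lists candidate files. This is only an illustration of the rule, not how any vendor's product or Windows 2000 implements it. It assumes the file system records last-access times (some systems are configured not to update them), and the function name and the 180-day period are hypothetical.

```python
import os
import time


def archive_candidates(root, max_idle_days, now=None):
    """Return paths under root whose last-access time is older than max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds in a day
    stale = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            if os.stat(path).st_atime < cutoff:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # List files under the current directory not accessed in 180 days.
    for path in archive_candidates(".", max_idle_days=180):
        print(path)
```

The script only reports candidates; moving them to offline media and leaving a way for users to retrieve them is the job of the archiving product.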
Check for Disk Errors


S.M.A.R.T. is an acronym for Self-Monitoring, Analysis and Reporting Technology. It 
is an open standard for developing disk drives and software systems that automatically 
monitor the health of the drive and report potential problems. Potentially, this enables 
proactive solutions to disk errors before actual disk failure. To use S.M.A.R.T., you load 
software that is able to query and accept messages from the S.M.A.R.T. hard disk. The
software is often provided by the disk manufacturer and included with the hard disk or 
host adapter. Figure 11-19 shows a hard disk monitoring utility included with Promise 
Technologies’ FastTrak 100 IDE host adapter. 


Figure 11-19 This hard disk monitoring utility monitors the health of a hard drive using
S.M.A.R.T. reporting


NETWORK 


Network performance on a server is contingent on the actual network interface card(s)
installed and the components connecting the server to the network, such as cabling,
switches, and/or hubs. Performance Monitor can only measure the traffic on the NICs
local to the server.
Acceptable Network Performance


Network utilization is one of the most important network statistics. Most monitoring 
and reporting tools provide network utilization values as their primary reporting vari- 
able. Percentages of up to about 30% network utilization are acceptable. Collision net- 
works (Ethernet) that exceed 30-50% utilization need to be monitored closely to 
prevent a larger increase of traffic that may cause network delays or low throughput. 
Server network utilization measures traffic on a specific NIC, and segment utilization 
measures all traffic on a given segment. Network and server traffic are monitored and 
analyzed separately, but the acceptable values are the same. Overall network utilization 
and server network utilization usually affect each other. 
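As a rough illustration of the thresholds above, the following Python sketch converts a measured byte rate into a utilization percentage and labels it. The function names and the label text are assumptions made for this example; the label for values above 50% is also an assumption, since the text only describes the ranges up to 50%.

```python
def utilization_percent(bytes_per_sec, link_mbps):
    """Return the percentage of link bandwidth in use.

    bytes_per_sec: measured traffic in bytes per second
    link_mbps: nominal link speed in megabits per second (10, 100, ...)
    """
    bits_per_sec = bytes_per_sec * 8
    return 100.0 * bits_per_sec / (link_mbps * 1_000_000)


def classify_utilization(percent):
    """Label a utilization value using the ranges described in the text."""
    if percent <= 30:
        return "acceptable"
    if percent <= 50:
        return "monitor closely"
    # Assumed label: the text does not name a category above 50%.
    return "above monitored range"
```

For instance, 1,250,000 bytes per second on a 100 Mbps link works out to 10% utilization, which falls in the acceptable range.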


To measure the server’s ability to send/receive data and handle network requests, mon- 
itor the following objects and counters: 


• Object: Network Segment
  Counter: % Network Utilization
• Object: Network Interface
  Counter: Output Queue Length


The % Network Utilization counter represents the percentage of network bandwidth in
use on the network segment. Each NIC will be an instance of the segment. Output
Queue Length measures the number of packets waiting to be put out on the network.
Values of 1 or 2 are acceptable, but anything higher means your NIC cannot keep up
with requests. Multihoming (discussed in the next section) can be a workable solution
in this instance.
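The Output Queue Length rule can be expressed as a small check over logged samples. The sketch below is illustrative only: `queue_report` and its dictionary keys are hypothetical names, and judging the NIC by the average of the samples is an assumption made for the example rather than a rule stated in the text.

```python
from statistics import mean

# From the text: values of 1 or 2 are acceptable.
MAX_ACCEPTABLE_QUEUE = 2


def queue_report(samples):
    """Summarize Output Queue Length samples from a monitoring session."""
    if not samples:
        raise ValueError("at least one sample is required")
    average = mean(samples)
    return {
        "average": average,
        "minimum": min(samples),
        # Assumption: the NIC is judged by its average queue length.
        "nic_keeping_up": average <= MAX_ACCEPTABLE_QUEUE,
    }
```

A session whose samples average 5 and never fall below 3 would report `nic_keeping_up` as `False`, which is the kind of result that points toward multihoming.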


Note: The Network Monitor Agent service must be installed on Windows NT 4.0/2000 for the
Network Segment object to be available in Performance Monitor.


Network Solutions 


If you find that network utilization is too high, consider the following solutions. 


Multihoming 


A multihomed server has two or more NICs installed. Each NIC has a unique IP address. 
These addresses can be on the same subnet or segmented in two or more subnets. Multiple 
network cards can notably improve throughput to the server provided network bandwidth 
is not oversubscribed. More throughput can provide users faster access to data. 


There are disadvantages to multihoming some servers because of name resolution issues. 
A server with two or more NICs will have multiple IP addresses but only one name. In 
the Microsoft environment, multihomed servers must be manually designated in the WINS 
server to guarantee accurate name resolution. DNS names already require manual entry
unless you are using dynamic DNS (DDNS), available with Windows 2000 or NetWare. 
In that case, hosts can automatically register host names to both IP addresses. 


Port or Link Aggregation 


Introduced in Chapter 7, port aggregation is a software solution to server bandwidth 
bottlenecks. Port aggregation can double bandwidth to a single segment by combining 
throughput of two network cards on a single segment. For example, two 10/100 Mbps 
NICs can be pooled by the software to provide 200 Mbps throughput. Aggregation soft- 
ware also allows a server with multiple NICs to be recognized by a single IP and MAC 
address. Single IP and MAC addresses overcome the naming resolution problems that 
can be encountered by standard multihomed servers. 
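The arithmetic behind the 200 Mbps figure is simply the sum of the pooled NIC speeds. The short Python sketch below shows it; the function name is hypothetical, and the result is a theoretical ceiling, since real throughput depends on the aggregation software and the rest of the network.

```python
def aggregated_mbps(nic_speeds_mbps):
    """Return the theoretical combined bandwidth of pooled NICs, in Mbps."""
    return sum(nic_speeds_mbps)


# Two NICs running at 100 Mbps each, as in the example in the text.
combined = aggregated_mbps([100, 100])
```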


Link aggregation is a hardware and software solution similar to port aggregation. With 
link aggregation, there is one NIC with multiple ports to provide the additional band- 
width. Link aggregation also requires that only one driver be installed for the network 
card. This is still a relatively proprietary solution (Intel) and must have compatible 
switches to deliver maximum performance. 


Note: Multihoming, port aggregation, and link aggregation only increase the bandwidth to
and from the server, but do nothing to increase the overall bandwidth of the network
segment.


Improve Network Equipment 


Over the past decade, the lower cost and availability of networking equipment has created 
enormous growth in IT for small and medium-sized companies. One of the key pieces of 
equipment is the hub (introduced in Chapter 7). Recall that the hub enables multiple 
nodes to be linked together on a single bus and communicate data to a common destina- 
tion. Switches can replace hubs to improve network throughput and performance. 


A hub with 16 ports accommodates 16 nodes. Using the analogy of highway traffic, 
think of each of those ports as a lane of traffic. The exit to the bus from the hub is a sin- 
gle lane. Consequently, a hub forces multiple lanes of traffic (in this example, 16) to a 
single lane. This results in multiple collisions while contesting for the single lane. 


A switch with 16 ports also accommodates 16 nodes. However, a switch maintains each 
lane of traffic to the exit point on the bus. There are no collisions as the lanes are merged. 
In Ethernet topology, reduced collisions dramatically improve throughput. 


Note: Some networks combine switch and hub technology. Hubs, rather than individual
nodes, are connected to ports on a switch. While there is no absolute right or wrong
to network design, without careful analysis of traffic patterns, arbitrarily connecting
hubs to switches can erode the additional throughput provided by the switch.
Upgrade NICs


Ethernet technology as originally developed was capable of transferring data at speeds up 
to 10 Mbps. Accordingly, the components (NICs, hubs, bridges, routers) developed for net- 
works provided the same throughput speed. In the mid-1990s, Fast Ethernet (100 Mbps) 
became more affordable and more prevalent in networks. 


Upgrading a network card from 10 Mbps to 100 Mbps can boost server throughput and
performance. Also consider using full-duplex NICs instead of half-duplex NICs if you
are using switches instead of hubs, because a hub cannot operate in full-duplex mode.
Recall that full-duplex uses an additional pair of wires and removes collision detection
to double potential throughput. As with any other network equipment upgrade, it is only
effective if the network bandwidth is not oversubscribed.


Place the Server on the Other Side of a Network Bottleneck 


Simple server placement can sometimes eliminate network bottlenecks. You should perform 
network traffic analysis to identify a bottleneck, but some solutions are simple and logical. 


For example, 12 engineers are working on a project and need to share data. Each engineer 
has a workstation that connects to the network through a hub to the backbone. The engi- 
neers have access to a dedicated server named ENGSRV1. This server is located and main- 
tained in a server room, one among many in racks of servers. The engineers complain about 
slow response time when they save and access data on the server. System administrators ana- 
lyze server performance and determine the server is handling requests at an acceptable 
rate—so the problem does not appear to be in the hardware performance on ENGSRV1. 
The likely conclusion is a network bottleneck between the engineers and the server. 


A possible solution to this bottleneck is to swap the hub where the engineers are connected 
for a 16-port switch. You can connect the server (ENGSRV1) to the same switch with the 
engineer workstations and bypass the network bottleneck. This is a very simplified solution 
but introduces the logic to apply when monitoring and analyzing network performance. 


CHAPTER SUMMARY 


• Performance monitoring concepts are consistent across all platforms. The specific
  tools and functions vary according to operating system.

• System Performance Monitor/2 (SPM/2) is designed to analyze hardware and software
  in the OS/2 environment.

• UNIX/Linux text-based tools such as vmstat, ps, df, and top are used to monitor
  performance on Linux systems. You can also use any of several GUI tools, including
  the GNOME System Monitor, the Stripchart Plotter, and third-party tools such as
  Unicenter TNG from Computer Associates.

• NetWare uses the Traffic Manager tool to monitor network traffic. Traffic Manager
  works with a Windows NT computer within the Performance Monitor tool. To
  monitor main system resources such as memory, CPU, and hard disk activity, use
  the text-based NetWare Monitor utility from the command console. The Java-based
  ConsoleOne interface also includes a NetWare monitoring utility. To monitor web
  and FTP servers, use the Novell Internet Caching System (ICS).


• Performance monitoring on the Windows NT operating system uses a GUI tool
  named Performance Monitor, and the default view is Chart View.

• Performance monitoring in Windows 2000 uses the Microsoft Management
  Console (MMC) graphical interface to the System Monitor, but the objects, counters,
  and instances are nearly identical to those in the Windows NT 4.0
  Performance Monitor tool.

• The Windows NT/2000 Performance Monitor uses objects, instances, and counters
  to measure performance on local servers or remote systems. Performance monitoring
  tools provide both real-time monitoring capability and logging facilities.

• A baseline is established by recording performance data when a server is healthy, or
  running normally. The best time to create a baseline is while the server is
  experiencing maximum activity.

• When you monitor performance to detect a bottleneck, you are looking for the
  resource (processor, memory, etc.) that is causing the delay in the transmission of data.

• Baseline information for a server can also be used for capacity planning.

• Within a server there are limited resources that can affect the performance of a
  given system. The resources work hand-in-hand and are capable of influencing
  the behavior of one another.

• Processor, Memory, PhysicalDisk, and Network Segment are the basic resources to
  track in performance monitoring.

• Processor utilization levels exceeding 65% on a consistent basis during performance
  monitoring usually indicate that the processor is the bottleneck in the system.

• SMP can provide improved performance by making multiple CPUs available to
  complete individual processes simultaneously.

• Writing data to compressed folders and using unnecessary encryption places an
  extra load on the CPU.

• Software RAID-5 can significantly diminish processor performance because the
  processor must spend resources to calculate the parity bit when writing data.

• Monitoring swap space is an important aspect of memory management to avoid
  out-of-memory errors. You can optimize the swap space by spreading it across
  multiple disks or a RAID array. Avoid placing the swap space on the system disk.

• Page Faults/sec is a good indicator of excessive paging on a Microsoft server.

• The best method to determine when additional memory is justified is thorough
  performance monitoring. The ability to upgrade to more or faster RAM is
  dependent on the motherboard.
• An undetected memory leak can not only cause performance deterioration, but a
  system crash as well.

• Hardware or software RAID arrays can significantly increase disk performance
  and provide fault tolerance. Software RAID-5 will increase performance on disk
  reads. Performance suffers on disk writes, however, due to processor-intensive
  parity calculations.

• The most dramatic and fundamental improvement to the disk subsystem is to
  upgrade from IDE disks to SCSI disks.

• Distributed file systems enable users to map file resources to a single file share point
  and transparently access resources from many physical locations.

• Overall network utilization and server network utilization usually affect each other.

• Multiple network cards can notably improve throughput to the server.

• Port or link aggregation can double bandwidth to a single segment by combining
  throughput of two or more network cards on a single segment.

• Upgrading a network card from 10 Mbps to 100 Mbps can boost server throughput
  and performance.

• Switches can replace hubs to improve network throughput and performance.

• In Ethernet networks, reducing collisions dramatically improves throughput.


KEY TERMS 


baseline — A collection of data that establishes acceptable performance. You compare 
variances in performance against the baseline to determine if perceived perfor- 
mance issues are real. 

bottleneck — One or more system components that hinder the performance of the 
rest of the system. Other system components must wait for the bottleneck item to 
complete its task before resuming activity. 

compression — Data formatted to use less storage space than unformatted data. 

counter — In Windows NT and 2000 Performance Monitor, a subset of an object 
that measures a particular aspect of that object. 

drivers — One or more files loaded into the operating system to control a hardware 
device. 

instances — In Windows NT and 2000 Performance Monitor, a subset of object 
counters that distinguishes like objects from one another. For example, instances 
would apply to multiple processors, hard disks, or NICs. 

memory leak — A program that uses system memory but does not release it when 
finished. A memory leak consumes memory over time, and causes performance 
problems because more hard disk virtual memory is required. Eventually, memory 
leaks can cause a system to return out-of-memory messages or crash. 
multithreading — Two or more simultaneously running program threads.
Multithreading is useful for improving performance. Multithreading requires an 
operating system that can support this, and programmers must be careful to write 
applications so that threads do not interfere with one another. 

objects — In Windows NT/2000 Performance Monitor, resources such as Processor, 
Memory, PhysicalDisk, and Network Segment. 

process — A running program. 

System Performance Monitor/2 (SPM/2) — Tool used to measure performance 
statistics in the OS/2 operating system environment. 

threads — Program units of execution that can run separately from other threads. A 
thread is also the means by which an application accesses memory and processor time. 


REVIEW QUESTIONS 
1. Which of the following are components of OS/2 performance monitoring? 
a. Data Collection Facility 
b. System Monitor 
c. OS/390 
d. SPM/2 Monitor 


2. The UNIX ________ tool shows resources currently running that consume the
most memory.
a. vmstat
b. Logging Facility
c. top
d. Committed Bytes


3. The Windows NT Performance Monitor relies on which two elements to mea- 
sure performance? 


a. objects 
b. cache 
c. services 
d. counters 


4. The counter Interrupts/sec is associated with which object in NT Performance 
Monitor? 


a. Memory 
b. PhysicalDisk 
c. Processor 
d. LogicalDisk 
5. Windows 2000 real-time performance is observed with the:
a. vmstat utility
b. System Monitor
c. MMC
d. performance logs

6. A test used to compare performance of hardware/software on servers is called a:
a. bottleneck
b. network segment
c. baseline
d. page fault

7. One or more system components that hinder the performance of the rest of the
system is known as a:
a. bottleneck
b. baseline
c. multihoming
d. port aggregation

8. To get the most effective comparisons, the best time to create a baseline is:
a. between 12:00 A.M. and 6:00 A.M.
b. during times of minimal activity
c. immediately after rebooting
d. during periods of maximum activity

9. Predicting server performance using a current baseline and future conditions is called:
a. network planning
b. capacity planning
c. server planning
d. performance planning

10. When data is written to a compressed partition or folder, the processor must:
a. remove compression before writing to the disk
b. hold the data permanently in memory
c. use multiple controllers
d. work harder to compress the data before it is written


11. Data encryption affects performance because:
a. more memory is required to hold the private key
b. more hard disk space is required to store the encryption bits
c. additional protocols are necessary to transmit encrypted data over the network
d. the processor must perform calculations to encrypt and decrypt the data

12. PC133 SDRAM is capable of synchronizing with the ________ and
reaching clock speeds of ________.
a. CPU bus/600 MHz
b. page file/133 MHz
c. CPU bus/133 MHz
d. serial port/600 MHz

13. A ________ is a bug in an application or program that prevents it from
freeing up memory that it no longer needs.
a. memory leak
b. page fault
c. SCSI
d. cluster

14. Software RAID-5 improves performance for ________ operations.
a. write
b. delete
c. read
d. copy

15. When files exist in noncontiguous pieces on a hard disk, the condition is known
as ________.
a. disk performance
b. defrag
c. disk fragmentation
d. IDE fault tolerance

16. SCSI-3 supports data transfer rates up to ________.
a. 320 MBps
b. 40 Mbps
c. 133 MHz
d. 600 MHz
17. The server named Infinity is approaching 80% disk capacity. There is no budget to
increase disk space at this time. Four other servers are available and can reasonably
increase file capacity 10-15%. What is a possible solution?
a. add more memory
b. upgrade to SCSI
c. implement a distributed file system
d. rename the server Finite

18. After monitoring a heavily used file server for two hours, the network analysis
shows the Output Queue Length averaged a value of 5 and never fell below 3.
What is a possible solution?
a. add more memory
b. upgrade the processor
c. upgrade the motherboard
d. multihoming

19. Five illustrators work in a remote office across the city from the main office. They
each use Windows 2000 Professional workstations. Currently, each illustrator needs
to access storyboards on a server in the main office. Only one person accesses the
storyboard files from the main office. The illustrators consistently complain about
slow access to the server. After monitoring the server, you find it is performing
within acceptable parameters. What is a possible solution?
a. SCSI controller for the server
b. upgrade to faster memory
c. move the server to the remote location
d. print all files and hire a courier

20. ________ can replace ________ to improve network throughput
and performance.
a. IDE/SCSI
b. hubs/routers
c. switches/controllers
d. switches/hubs
HANDS-ON PROJECTS


Note: All projects in this chapter are designed for Windows 2000 Server or Professional.


Project 11-1


In this project, you will monitor the Processor object on a Windows 2000 server. 


1. Click Start, point to Programs, point to Administrative Tools, and then click
Performance. The MMC opens.

2. Click System Monitor in the left pane, if necessary.

3. Click the plus sign (+) in the row of buttons above the chart in the right pane.
The Add Counters dialog box opens.

4. Click Use local computer counters.

5. From the drop-down list under the Performance object, click Processor, if it is
not already selected.

6. Click % Processor Time, if it is not already selected. Click Add.

7. Click % User Time. Click Add.

8. Click % Privileged Time. Click Add.

9. Click % Interrupt Time. Click Add. Click Close.

10. Start several applications on the server, such as Paint, Word, and Pinball if available.
If possible, start a utility that constantly accesses the processor (for example, a 3D
screen saver) and preview it. Notice that the counters increase when the screen saver
is running. (This is a very good reason not to run fancy screen savers on a server.)

11. Click the first counter, % Processor Time. Press Ctrl+H. Notice that the
processor line in the chart turns white, making it easier to identify when multiple
counters are running at once.

12. Click each of the other counters. Note that each chart line turns white as the
counter is highlighted.

13. Click % User Time. Press the Del key to remove the counter. Repeat for
% Privileged Time and % Interrupt Time. The % Processor Time counter should remain.

14. Leave System Monitor open for the next project.
Project 11-2

In this project, you will monitor the basic resource objects as defined in this chapter.

Note: The Network Monitor Agent service must be installed to initiate the Network
Segment object (Step 7). If necessary, add this using the Add/Remove
Programs Control Panel item. Your instructor can help you add this.

1. In System Monitor, click the (+) plus sign in the row of buttons above the chart
in the right pane. The Add Counters dialog box opens.

2. Click Use local computer counters.

3. Click the Performance object list box and then click Memory.

4. From the counters list, select % Committed Bytes In Use. Click Add.

5. Click the Performance object list box and select PhysicalDisk.

6. From the counter list, click % Disk Time, if necessary. Click _Total from the
instance list, if necessary, and then click Add.

7. Click the Performance object list box and select Network Segment.

8. From the counter list, click % Network Utilization. Click on the NIC for your
subnet in the Instance list. Click Add.

9. Click Close.

10. Start several applications on the server, such as Paint, Word, and Pinball if available.
If possible, start a utility that constantly accesses the processor (for example, a 3D
screen saver) and preview it. Notice that the counters increase when the screen
saver is running.

11. Click the first counter, % Processor Time. Press Ctrl+H. Notice that the
processor line in the chart turns white, making it easier to identify when multiple
counters are running at once.

12. Click each of the other counters. Note that each chart line turns white as the
counter is highlighted.

13. Delete all counters and leave the MMC open for Project 11-3.


Project 11-3

In this project, you will configure a performance log using the basic resource objects.

1. Start in the Performance MMC window that is still open from Project 11-2.

2. Click Performance Logs and Alerts.

3. In the right pane, right-click Counter Logs and click New Log Settings.

4. Type Test Log in the Name text box, and click OK. The Test Log dialog box opens.

5. Click Add. The Select Counters dialog box opens.

6. Click Use local computer counters.


7. Click Processor from the Performance Object list box. Click % Processor 
Time, and then click Add. 


8. Click Memory from the Performance Object list box. Click % Committed 
Bytes In Use, and then click Add. 


9. Select PhysicalDisk from the Performance Object list box. Click % Disk Time 
from the Counters list. Click _Total from the Instances list. Click Add. 


10. Click Network Segment from the Performance Object list box. Click % 
Network Utilization from the Counters list, and click on the NIC for your sub- 
net from the Instance list. Click Add, and then click Close. 


11. Set the Sample Data Interval to 5 seconds. 

12. Click the Log Files tab. Change the End file names with option to yyyymmdd. 
13. Click the Schedule tab. In the Start log frame, click Manually. 

14. In the Stop log frame, click the After option and enter 5 minutes. Click OK. 


15. Double-click Counter Logs in the right pane. Right-click Test Log, and 
click Start. 


16. Start several applications on the server, such as Paint, Word, and Pinball if available. If 
possible, start a utility that constantly accesses the processor (for example, a 3D 
screen saver) and preview it. Notice that the counters increase when the screen saver 
is running. (This is a very good reason not to run fancy screen savers on a server.) 


17. Wait a minimum of five minutes before starting the next exercise. The Test Log 
icon will turn red when logging is complete. (It may be necessary to refresh the 
screen by clicking the Refresh button or pressing F5.) 


18. Leave System Monitor open for the next project. 


Project 11-4


In this project, you will view the performance log data from the previous project in the
Chart format.


1. Click System Monitor in the left pane. Click the View Log File Data button 
(fourth button from the left). 


2. Click Test_Log and click Open. 
3. Click the (+) plus sign to Add Counters. The Add Counters dialog box opens. 


4. Add all objects and counters available. (Note that only the objects and counters 
selected for logging are available.) Click Close. By default, the data appears in the 
Chart format, which is useful for graphically viewing performance trends from 
one point in time to the next. 


5. Leave Performance Monitor open for the next project. 
Project 11-5

In this project, you will view data in the Histogram and Report formats.

1. With the data from Project 11-4 still displayed in Chart format, click the View
Histogram button (sixth button from the left). This format is useful for viewing
log data at a specific point in time.

2. Click the View Report button (seventh button from the left). This format is useful
for displaying exact numbers for the specified data.

3. Leave the Performance Monitor open for the next project.


Project 11-6 


In this project, you will create a performance alert, send a system message, and observe 
the results in Event Viewer. 


1. In the left-hand pane of the Performance window, click Performance Logs and
Alerts. Right-click Alerts, and then click New Alert settings.

2. Name the alert Processor. Click OK. The Processor dialog box opens.

3. Click Add. The Select Counters box opens. Select the Processor object.

4. Click the % Processor Time counter. Click Add. Click Close.

5. In the “Alert when the value is:” list box, choose Over. Enter 5 in the Limit box.

6. Choose to sample data every 20 seconds.

7. Click the Action tab. Accept the default to Log an entry in the application event log.

8. Check the box to send a network message. Enter your computer name.

9. Click the Schedule tab. Click the Start scan Manually radio button.

10. Choose Stop scan After 1 minute. Click OK.

11. Right-click the Processor alert. Select Start. Open several applications to
increase processor time. You will begin receiving system messages. Click OK to
acknowledge the messages. Wait one minute.

12. Click Start, point to Programs, point to Administrative Tools, and click
Event Viewer.

13. Click the Application Log. Note the messages indicating that the processor
exceeded the limit set in the alert.

14. Close all open windows.
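The logic behind the alert configured in this project can be pictured as a scan over sampled counter values. The Python sketch below is a conceptual illustration only and is not how Performance Logs and Alerts is implemented; `scan_for_alerts` and its parameters are hypothetical names chosen for the example.

```python
def scan_for_alerts(samples, limit, over=True):
    """Return (sample_index, value) pairs for samples that cross the limit.

    samples: counter values collected at a fixed interval (for example,
             % Processor Time sampled every 20 seconds)
    limit:   the threshold entered in the Limit box
    over:    True to alert on values over the limit, False for under
    """
    alerts = []
    for index, value in enumerate(samples):
        crossed = value > limit if over else value < limit
        if crossed:
            alerts.append((index, value))
    return alerts
```

With a limit of 5, as in Step 5, the samples `[3, 12, 5, 40]` would raise alerts for the second and fourth samples only, because a value equal to the limit is not over it.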
CASE PROJECTS


1. It’s your first week on the job as the administrator. Complaints roll in that logon
is slow. When you ask for performance logs on logons, you get blank looks except
for one guy who tried once but couldn’t remember where the data was saved. You
decide to monitor some performance elements yourself, in order to identify what
the bottleneck might be.


Based on the data in Table 11-6, what is the likely bottleneck and a possible solution? 


Table 11-6 Sample Data for Case Project 11-1 (Values Are Averages Based on 
One-Minute Monitor Times) 


Counter 


Processor % Privileged Time 


Processor % User Time 


Memory % Committed Bytes in Use 


Server Logon Total 
Server Logon/sec 


Network Segment % Network Utilization 


2. Day 2 on the new job and you are downloading files from the primary
file/print server. While no one has complained about the slow downloads, the
performance is not acceptable to you. You use Performance Monitor to establish
a baseline. Based on the data in Table 11-7, identify the most likely bottleneck
and suggest a solution.


Table 11-7 Sample Data for Case Project 11-2 (Values Are Averages Based on 
30-Minute Performance Log) 


Counter 


Processor % Privileged Time 


Processor % User Time 


Memory Page Faults/sec 
PhysicalDisk % Disk Time 


Network Segment % Network Utilization 
3. The salesman for a new integrated contact management software package is touting
the virtues of his wares to your marketing manager. The salesman leaves
behind an evaluation copy of the server software. The marketing manager wants
to try it out immediately, of course. Like a smart administrator, you install the
software on a test system. The next morning, you record a performance log to
track the impact of the software. Based on the data in Table 11-8, is there a
bottleneck? Is there any evidence of potential problems?


Table 11-8 Sample Data for Case Project 11-3 (Values Are Averages Based on 
240-Minute Performance Log) 


Counter                                     Value

Processor % Privileged Time                 12%
Processor % User Time                       9%
Memory % Committed Bytes in Use             22-53%
PhysicalDisk % Disk Time                    17%
Network Segment % Network Utilization       8%


